Locally Determining the Number of Neighbors in the k-Nearest Neighbor Rule Based on Statistical Confidence

نویسندگان

  • Jigang Wang
  • Predrag Neskovic
  • Leon N. Cooper
چکیده

The k-nearest neighbor rule is one of the most attractive pattern classification algorithms. In practice, the value of k is usually determined by the cross-validation method. In this work, we propose a new method that locally determines the number of nearest neighbors based on the concept of statistical confidence. We define the confidence associated with decisions that are made by the majority rule from a finite number of observations and use it as a criterion to determine the number of nearest neighbors needed. The new algorithm is tested on several realworld datasets and yields results comparable to those obtained by the knearest neighbor rule. In contrast to the k-nearest neighbor rule that uses a fixed number of nearest neighbors throughout the feature space, our method locally adjusts the number of neighbors until a satisfactory level of confidence is reached. In addition, the statistical confidence provides a natural way to balance the trade-off between the reject rate and the error rate by excluding patterns that have low confidence levels.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Neighborhood size selection in the k-nearest-neighbor rule using statistical confidence

The k-nearest-neighbor rule is one of the most attractive pattern classification algorithms. In practice, the choice of k is determined by the cross-validation method. In this work, we propose a new method for neighborhood size selection that is based on the concept of statistical confidence. We define the confidence associated with a decision that is made by the majority rule from a finite num...

متن کامل

A Statistical Confidence-Based Adaptive Nearest Neighbor Algorithm for Pattern Classification

The k-nearest neighbor rule is one of the simplest and most attractive pattern classification algorithms. It can be interpreted as an empirical Bayes classifier based on the estimated a posteriori probabilities from the k nearest neighbors. The performance of the k-nearest neighbor rule relies on the locally constant a posteriori probability assumption. This assumption, however, becomes problem...

متن کامل

A New Hybrid Approach of K-Nearest Neighbors Algorithm with Particle Swarm Optimization for E-Mail Spam Detection

Emails are one of the fastest economic communications. Increasing email users has caused the increase of spam in recent years. As we know, spam not only damages user’s profits, time-consuming and bandwidth, but also has become as a risk to efficiency, reliability, and security of a network. Spam developers are always trying to find ways to escape the existing filters therefore new filters to de...

متن کامل

Optimized Nearest Neighbor Methods Cam Weighted Distance & Statistical Confidence

Nearest neighbor classification methods are a useful and a relatively straightforward to implement classification technique. However, despite such appeal, they still suffer from the curse of dimensionality. Additionally, the nature of the data sets may not be wholly applicable to the model assumed in the nearest neighbor methods. As such there have been many proposed optimizations. Two such opt...

متن کامل

Estimation of Density using Plotless Density Estimator Criteria in Arasbaran Forest

    Sampling methods have a theoretical basis and should be operational in different forests; therefore selecting an appropriate sampling method is effective for accurate estimation of forest characteristics. The purpose of this study was to estimate the stand density (number per hectare) in Arasbaran forest using a variety of the plotless density estimators of the nearest neighbors sampling me...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2005